TITLE

Client-server based subscriber characterization system 

Background of the Invention 

Subscribers face an increasingly large number of choices 
for entertainment programming, which is delivered over networks 
such as cable TV systems, over-the-air broadcast systems, and 
switched digital access systems which use telephone company 
twisted wire pairs for the delivery of signals.

Cable television service providers have typically provided 
one-way broadcast services but now offer high-speed data 
services and can combine traditional analog broadcasts with 
digital broadcasts and access to Internet web sites. Telephone 
companies can offer digital data and video programming on a 
switched basis over digital subscriber line technology. 
Although the subscriber may only be presented with one channel 
at a time, channel change requests are instantaneously 
transmitted to centralized switching equipment and the 
subscriber can access the programming in a broadcast-like 
manner. Internet Service Providers (ISPs) offer Internet 
access and can offer access to text, audio, and video 
programming which can also be delivered in a broadcast-like 
manner in which the subscriber selects "channels" containing 
programming of interest. Such channels may be offered as part 
of a video programming service or within a data service and can 
be presented within an Internet browser. 

Along with the multitude of programming choices which the 
subscriber faces, subscribers are subject to advertisements, 
which in many cases subsidize or pay for the entire cost of the 
programming. While advertisements are sometimes beneficial to
subscribers and deliver desired information regarding specific
products or services, consumers generally view advertising as a
"necessary evil" for broadcast type entertainment.

In order to deliver more targeted programming and
advertising to subscribers, it is necessary to understand their
likes and dislikes to a greater extent than is presently done.
Systems which identify subscriber preferences based on
their purchases and responses to questionnaires allow for the
targeted marketing of literature in the mail, but do not in any
sense allow for the rapid and precise delivery of programming
and advertising which is known to have a high probability of
acceptance by the subscriber. In order to determine which
programming or advertising is appropriate for the subscriber,
knowledge of that subscriber and the subscriber's product and
programming preferences is required.

Specific information regarding a subscriber's viewing
habits or the Internet web sites they have accessed can be
stored for analysis, but such records are considered private
and subscribers are not generally willing to have such
information leave their control. Although there are regulatory
models which permit the collection of such data on a "notice
and consent" basis, there is a general tendency towards legal
rules which prohibit the collection of such raw data.

With the migration of services from a broadcast based
model to a client-server based model in which subscribers make
individualized requests for programming to an Internet access
provider or content provider, there is an opportunity to monitor
the subscriber viewing characteristics to better provide them
with programming and advertising which will be of interest to
them. A server may act as a proxy for the subscriber requests
and thus be able to monitor what a subscriber has requested and
is viewing. Since subscribers may not want this raw data to be
utilized, there is a need for a system which can process this
information and generate statistically relevant subscriber
profiles. These profiles should be accessible to others on the
network who may wish to determine if their programming or
advertisements are suitable for the subscriber.

For the foregoing reasons, there is a need for a 
subscriber characterization system which can generate and store 
subscriber characteristics which reflect the probable 
demographics and preferences of the subscriber and household. 



Summary Of The Invention 

The present invention includes a system for characterizing
subscribers watching video or multimedia programming based on
monitoring the requests made by the subscriber for programming
to a server which contains the content or which requests the
content from a third party. The server side of the network is
able to monitor the subscriber's detailed selection choices
including the time duration of their viewing, the volume at
which the programming is listened to, and the program selection.

The server side collects text information about that
programming to determine what type of programming the
subscriber is most interested in. In addition, the system can
generate a demographic description of the subscriber or
household which describes the probable age, income, gender and
other demographics. The resulting characterization includes
probabilistic determinations of what other programming or
products the subscriber/household will be interested in.

In a preferred embodiment the textual information which
describes the programming is obtained by context mining of text
associated with the programming. The associated text can be
from the closed-captioning data associated with the
programming, an electronic program guide, or from text files
associated with or part of the programming itself.

The system can provide both session measurements, which
correspond to a profile obtained over a viewing session, and an
average profile, which corresponds to data obtained over
multiple viewing sessions.

The present invention also encompasses the use of
heuristic rules in logical form or expressed as conditional
probabilities to aid in forming a subscriber profile. The
heuristic rules in logical form allow the system to apply
generalizations which have been learned from external studies
to obtain a characterization of the subscriber. In the case of
conditional probabilities, determinations of the probable
content of a program can be applied in a mathematical step to a
matrix of conditional probabilities to obtain probabilistic
subscriber profiles indicating program and product likes and
dislikes, as well as to determine probabilistic demographic
data.
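
As an illustration only, the mathematical step mentioned above can be
sketched as a weighted sum over a matrix of conditional probabilities.
The category names, age groups, and every number below are hypothetical
and are not taken from the specification; the sketch simply assumes that
the probable content of a program is a list P(category) and that each
matrix row holds P(age group | category), so that the law of total
probability yields a probabilistic demographic profile.

```python
# Hypothetical labels and values, chosen only to make the arithmetic visible.
CATEGORIES = ["news", "sports", "drama"]
AGE_GROUPS = ["18-34", "35-54", "55+"]

# Probable content of one program: P(category). The entries sum to 1.
program_content = [0.25, 0.5, 0.25]

# Heuristic rules as conditional probabilities:
# row i holds P(age group | category i), and each row sums to 1.
cond_prob = [
    [0.25, 0.25, 0.5],   # news
    [0.5, 0.25, 0.25],   # sports
    [0.25, 0.5, 0.25],   # drama
]


def demographic_profile(content, rules):
    """Return P(group) = sum over categories of P(group | category) * P(category)."""
    n_groups = len(rules[0])
    return [
        sum(content[i] * rules[i][j] for i in range(len(content)))
        for j in range(n_groups)
    ]


profile = demographic_profile(program_content, cond_prob)
for group, p in zip(AGE_GROUPS, profile):
    print(group, p)
```

Because each row of the assumed matrix and the content list both sum to
one, the resulting profile is itself a probability distribution over the
age groups.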

One advantage of the present invention is that it allows
consumers to permit access to probabilistic
information regarding their household demographics and
programming/product preferences, without revealing their
specific viewing history. Subscribers may elect to permit
access to this information in order to receive advertising
which is more targeted to their likes/dislikes. Similarly, a
subscriber may wish to sell access to this statistical data in
order to receive revenue or receive a discount on a product or
a service.

Another advantage of the present invention is that the
resulting probabilistic information can be stored locally and
controlled by the subscriber, or can be transferred to a third
party which can provide access to the subscriber
characterization. The information can also be encrypted to
prevent unauthorized access, in which case only the subscriber
or someone authorized by the subscriber can access the data.

These and other features and objects of the invention will
be more fully understood from the following detailed
description of the preferred embodiments which should be read
in light of the accompanying drawings.

Brief Description of the Drawings

The accompanying drawings, which are incorporated in and
form a part of the specification, illustrate the embodiments of
the present invention and, together with the description, serve
to explain the principles of the invention.

In the drawings:

FIG. 1 shows a context diagram for a subscriber
characterization system;

FIG. 2 illustrates a block diagram for a realization of a
subscriber monitoring system for receiving video signals;

FIG. 3 illustrates a block diagram of a channel processor;

FIG. 4 illustrates a block diagram of a computer for a
realization of the subscriber monitoring system;

FIG. 5 illustrates a channel sequence and volume over a
twenty-four (24) hour period;

FIG. 6 illustrates a time of day detailed record;

FIG. 7 illustrates a household viewing habits statistical
table;

FIG. 8A illustrates an entity-relationship diagram for the
generation of program characteristics vectors;

FIG. 8B illustrates a flowchart for program
characterization;

FIG. 9A illustrates a deterministic program category
vector;

FIG. 9B illustrates a deterministic program sub-category
vector;

FIG. 9C illustrates a deterministic program rating vector;

FIG. 9D illustrates a probabilistic program category
vector;

FIG. 9E illustrates a probabilistic program sub-category
vector;

FIG. 9F illustrates a probabilistic program content
vector;

FIG. 10A illustrates a set of logical heuristic rules;

FIG. 10B illustrates a set of heuristic rules expressed in
terms of conditional probabilities;

FIG. 11 illustrates an entity-relationship diagram for the
generation of program demographic vectors;

FIG. 12 illustrates a program demographic vector;

FIG. 13 illustrates an entity-relationship diagram for the
generation of household session demographic data and household
session interest profiles;

FIG. 14 illustrates an entity-relationship diagram for the
generation of average and session household demographic
characteristics;

FIG. 15 illustrates average and session household
demographic data;

FIG. 16 illustrates an entity-relationship diagram for
generation of a household interest profile;

FIG. 17 illustrates a household interest profile including
programming and product profiles; and

FIG. 18 illustrates a client-server architecture for
realizing the present invention.

Detailed Description
Of The Preferred Embodiment



In describing a preferred embodiment of the invention
illustrated in the drawings, specific terminology will be used
for the sake of clarity. However, the invention is not
intended to be limited to the specific terms so selected, and
it is to be understood that each specific term includes all
technical equivalents which operate in a similar manner to
accomplish a similar purpose.

With reference to the drawings, in general, and FIGS. 
1 through 18 in particular, the apparatus of the present 
invention is disclosed. 

The present invention is directed at an apparatus for
generating a subscriber profile which contains useful
information regarding the subscriber likes and dislikes. Such
a profile is useful for systems which provide targeted
programming or advertisements to the subscriber, and allows
material (programs or advertisements) to be directed at
subscribers who will have a high probability of liking the
program or a high degree of interest in purchasing the product.

Since there are typically multiple individuals in a
household, the subscriber characterization may not be a
characterization of an individual subscriber but may instead be
a household average. When used herein, the term subscriber
refers both to an individual subscriber as well as the average
characteristics of a household of multiple subscribers.

In the present system the programming viewed by the
subscriber, both entertainment and advertisement, can be
studied and processed by the subscriber characterization system
to determine the program characteristics. This determination
of the program characteristics is referred to as a program
characteristics vector. The vector may be a truly one-dimensional
vector, but can also be represented as an n-dimensional
matrix which can be decomposed into vectors.

The subscriber profile vector represents a profile of the
subscriber (or the household of subscribers) and can be in the
form of a demographic profile (average or session) or a program
or product preference vector. The program and product
preference vectors are considered to be part of a household
interest profile which can be thought of as an n-dimensional
matrix representing probabilistic measurements of subscriber
interests.

In the case that the subscriber profile vector is a
demographic profile, the subscriber profile vector indicates a
probabilistic measure of the age of the subscriber or average
age of the viewers in the household, sex of the subscriber,
income range of the subscriber or household, and other such
demographic data. Such information comprises household
demographic characteristics and is composed of both average and
session values. Extracting a single set of values from the
household demographic characteristics can correspond to a
subscriber profile vector.

The household interest profile can contain both
programming and product profiles, with programming profiles
corresponding to probabilistic determinations of what
programming the subscriber (household) is likely to be
interested in, and product profiles corresponding to what
products the subscriber (household) is likely to be interested
in. These profiles contain both an average value and a session
value, the average value being a time average of data, where
the averaging period may be several days, weeks, months, or the
time between resets of the unit.

Since a viewing session is likely to be dominated by a
particular viewer, the session values may, in some
circumstances, correspond most closely to the subscriber
values, while the average values may, in some circumstances,
correspond most closely to the household values.
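
The relationship between a session value and an average value can be
sketched as follows. This is a minimal illustration under stated
assumptions: the class name ProfileValue and its methods are hypothetical,
the tracked quantity is a single profile entry (for example, the
probability that the viewer falls in a given age bracket), and the average
is taken as an unweighted mean over the sessions seen since the last
reset, which is only one possible form of time average.

```python
class ProfileValue:
    """Tracks a session value and a running average for one profile entry."""

    def __init__(self):
        self.session = None      # value from the most recent viewing session
        self.average = None      # mean over all sessions since the last reset
        self.sessions_seen = 0

    def end_session(self, session_value):
        """Store the finished session's value and fold it into the average."""
        self.session = session_value
        self.sessions_seen += 1
        if self.average is None:
            self.average = session_value
        else:
            # Incremental mean: avoids storing every past session value.
            self.average += (session_value - self.average) / self.sessions_seen

    def reset(self):
        """Start a new averaging period, e.g. on a reset of the unit."""
        self.session = None
        self.average = None
        self.sessions_seen = 0


entry = ProfileValue()
entry.end_session(0.5)
entry.end_session(1.0)
print(entry.session, entry.average)
```

In this sketch the session value follows whoever dominated the latest
session, while the average moves more slowly and reflects the household
as a whole.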



FIG. 1 depicts the context diagram of a preferred
embodiment of a Subscriber Characterization System (SCS) 100. A
context diagram, in combination with entity-relationship
diagrams, provides a basis from which one skilled in the art can
realize the present invention. The present invention can be
realized in a number of programming languages including C, C++,
Perl, and Java, although the scope of the invention is not
limited by the choice of a particular programming language or
tool. Object oriented languages have several advantages in
terms of construction of the software used to realize the
present invention, although the present invention can be
realized in procedural or other types of programming languages
known to those skilled in the art.

In generating a subscriber profile, the SCS 100 receives
from a user 120 commands in the form of a volume control signal
124 or program selection data 122 which can be in the form of a
channel change but may also be an address request which
requests the delivery of programming from a network address. A
record signal 126 indicates that the programming or the address
of the programming is being recorded by the user. The record
signal 126 can also be a printing command, a tape recording
command, a bookmark command or any other command intended to
store the program being viewed, or program address, for later
use.

The material being viewed by the user 120 is referred to
as source material 130. The source material 130, as defined 
herein, is the content that a subscriber selects and may 
consist of analog video, Moving Picture Experts Group (MPEG)
digital video source material, other digital or analog
material, Hypertext Markup Language (HTML) or other type of
multimedia source material. The subscriber characterization
system 100 can access the source material 130 received by the
user 120 using a start signal 132 and a stop signal 134, which
control the transfer of source related text 136 which can be
analyzed as described herein.

In a preferred embodiment, the source related text 136 can
be extracted from the source material 130 and stored in memory.
The source related text 136, as defined herein, includes source
related textual information including descriptive fields which
are related to the source material 130, or text which is part
of the source material 130 itself. The source related text 136
can be derived from a number of sources including but not
limited to closed captioning information, Electronic Program
Guide (EPG) material, and text information in the source itself
(e.g. text in HTML files).

The Electronic Program Guide (EPG) 140 contains information
related to the source material 130 which is useful to the user
120. The EPG 140 is typically a navigational tool which
contains source related information including but not limited
to the programming category, program description, rating,
actors, and duration. The structure and content of EPG data is
described in detail in US Patent 5,596,373 assigned to Sony
Corporation and Sony Electronics which is herein incorporated
by reference. As shown in FIG. 1, the EPG 140 can be accessed
by the SCS 100 by a request EPG data signal 142 which results
in the return of a category 144, a sub-category 146, and a
program description 148.

In one embodiment of the present invention, EPG data is
accessed and program information such as the category 144, the
sub-category 146, and the program description 148 are stored in
memory.

In another embodiment of the present invention, the source
related text 136 is the closed captioning text embedded in the
analog or digital video signal. Such closed captioning text can
be stored in memory for processing to extract the program
characteristics vectors 150.

One of the functions of the SCS 100 is to generate the
program characteristics vectors 150 which are composed of
program characteristics data 152, as illustrated in FIG. 1. The
program characteristics data 152, which can be used to create
the program characteristics vectors 150 both in vector and
table form, are examples of source related information which
represent characteristics of the source material. In a
preferred embodiment, the program characteristics vectors 150
are lists of values which characterize the programming (source)
material according to the category 144, the sub-category
146, and the program description 148. The present invention may
also be applied to advertisements, in which case program
characteristics vectors contain, as an example, a product
category, a product sub-category, and a brand name.

As illustrated in FIG. 1, the SCS 100 uses heuristic rules
160. The heuristic rules 160, as described herein, are
composed of both logical heuristic rules as well as heuristic
rules expressed in terms of conditional probabilities. The
heuristic rules 160 can be accessed by the SCS 100 via a
request rules signal 162 which results in the transfer of a
copy of rules 164 to the SCS 100.



The SCS 100 forms program demographic vectors 170 from
program demographics 172, as illustrated in FIG. 1. The
program demographic vectors 170 also represent characteristics
of source related information in the form of the intended or
expected demographics of the audience for which the source
material is intended.

Subscriber selection data 110 is obtained from the
monitored activities of the user and in a preferred embodiment
can be stored in a dedicated memory. In an alternate
embodiment, the subscriber selection data 110 is stored on a
storage disk. Information which is utilized to form the
subscriber selection data 110 includes time 112, which
corresponds to the time of an event, channel ID 114, program ID
116, volume level 118, channel change record 119, and program
title 117. A detailed record of selection data is illustrated
in FIG. 6.
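
One way to picture the subscriber selection data 110 is as a list of
event records plus a running count of channel changes. The sketch below
is hypothetical: the class and field names, the use of strings for times
and identifiers, and the sample events are all assumptions made for
illustration, with the reference numerals from the text noted in comments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SelectionEvent:
    time: str            # time 112 of the event, e.g. "19:30"
    channel_id: str      # channel ID 114
    program_id: str      # program ID 116
    program_title: str   # program title 117
    volume_level: int    # volume level 118


@dataclass
class SubscriberSelectionData:
    events: List[SelectionEvent] = field(default_factory=list)
    channel_changes: int = 0   # channel change record 119

    def record(self, event: SelectionEvent) -> None:
        """Append an event, counting it as a channel change if the channel differs."""
        if self.events and self.events[-1].channel_id != event.channel_id:
            self.channel_changes += 1
        self.events.append(event)


log = SubscriberSelectionData()
log.record(SelectionEvent("19:00", "7", "P100", "Evening News", 5))
log.record(SelectionEvent("19:30", "7", "P101", "Weather", 5))
log.record(SelectionEvent("19:45", "12", "P200", "Ball Game", 8))
print(len(log.events), log.channel_changes)
```

Here the second event is a program change on the same channel and so does
not increase the channel change count, while the third event does.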

In a preferred embodiment, the household viewing habits 195
illustrated in FIG. 1 are computed from the subscriber selection
data 110. The SCS 100 transfers household viewing data 197 to
form the household viewing habits 195. The household viewing data
197 is derived from the subscriber selection data 110 by
looking at viewing habits at a particular time of day over an
extended period of time, usually several days or weeks, and
making some generalizations regarding the viewing habits during
that time period.

The program characteristics vector 150 is derived from the
source related text 136 and/or from the EPG 140 by applying
information retrieval techniques. The details of this process
are discussed in accordance with FIG. 8.
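
To make the idea of deriving a vector from text concrete, the following
sketch uses simple keyword counting. This is a deliberately simplified
stand-in and is not the context vector technique referenced in this
description; the category names, keyword lists, and the function name
category_vector are all hypothetical.

```python
import re

# Hypothetical keyword lists; a real system would use a far richer model.
KEYWORDS = {
    "sports": {"game", "score", "team"},
    "news": {"report", "election", "weather"},
}


def category_vector(text):
    """Return a dict mapping each category to its share of keyword hits."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = {
        category: sum(1 for word in words if word in vocabulary)
        for category, vocabulary in KEYWORDS.items()
    }
    total = sum(counts.values())
    if total == 0:
        # No evidence either way: fall back to a uniform vector.
        return {category: 1.0 / len(KEYWORDS) for category in KEYWORDS}
    return {category: counts[category] / total for category in KEYWORDS}


vector = category_vector("Team game score report")
print(vector)
```

The same function could be applied to closed captioning text or to an EPG
program description, since both are plain text in this sketch.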



The program characteristics vector 150 is used in
combination with a set of the heuristic rules 160 to define a
set of the program demographic vectors 170 illustrated in FIG.
1 describing the audience the program is intended for.

One output of the SCS 100 is a household profile including
household demographic characteristics 190 and a household
interest profile 180. The household demographic characteristics
190 result from the transfer of household demographic data
192, and the household interest profile 180 results from the
transfer of household interests data 182. Both the household
demographic characteristics 190 and the household interest
profile 180 have a session value and an average value, as will
be discussed herein.

The monitoring system depicted in FIG. 2 is responsible
for monitoring the subscriber activities, and can be used to
realize the SCS 100. In a preferred embodiment, the monitoring
system of FIG. 2 is located in a television set-top device or
in the television itself. In an alternate embodiment, the
monitoring system is part of a computer which receives
programming from a network.

In an application of the system for television services,
an input connector 220 accepts the video signal coming
from an antenna, cable television input, or other network. The
video signal can be analog or digital MPEG. Alternatively, the
video source may be a video stream or other multimedia stream
from a communications network including the Internet.

In the case of either analog or digital video, selected
fields are defined to carry EPG data or closed captioning text.
For analog video, the closed captioning text is embedded in the
vertical blanking interval (VBI). As described in US Patent
5,579,005, assigned to Scientific-Atlanta, Inc., the EPG
information can be carried in a dedicated channel or embedded
in the VBI. For digital video, the closed captioning text is
carried as video user bits in a user_data field. The EPG data
is transmitted as ancillary data and is multiplexed at the
transport layer with the audio and video data.



Referring to FIG. 2, a system control unit 200 receives
commands from the user 120, decodes the command and forwards
the command to the destined module. In a preferred embodiment,
the commands are entered via a remote control to a remote
receiver 205 or a set of selection buttons 207 available at the
front panel of the system control unit 200. In an alternate
embodiment, the commands are entered by the user 120 via a
keyboard.

The system control unit 200 also contains a Central
Processing Unit (CPU) 203 for processing and supervising all of
the operations of the system control unit 200, a Read Only
Memory (ROM) 202 containing the software and fixed data, and a
Random Access Memory (RAM) 204 for storing data. The CPU 203,
RAM 204, ROM 202, and I/O controller 201 are attached to a master
bus 206. A power supply in the form of a battery can also be
included in the system control unit 200 for backup in case of
power outage.

An input/output (I/O) controller 201 interfaces the system
control unit 200 with external devices. In a preferred
embodiment, the I/O controller 201 interfaces to the remote
receiver 205 and a selection button such as the channel change
button on a remote control. In an alternate embodiment, it can
accept input from a keyboard or a mouse.

The program selection data 122 is forwarded to a channel
processor 210. The channel processor 210 tunes to a selected
channel and the media stream is decomposed into its basic
components: the video stream, the audio stream, and the data
stream. The video stream is directed to a video processor
module 230 where it is decoded and further processed for
display to the TV screen. The audio stream is directed to an 
audio processor 240 for decoding and output to the speakers. 

The data stream can be EPG data, closed captioning text, 
Extended Data Service (EDS) information, a combination of 
these, or an alternate type of data. In the case of EDS the 
call sign, program name and other useful data are provided. In 
a preferred embodiment, the data stream is stored in a reserved 
location of the RAM 204. In an alternate embodiment, a magnetic 
disk is used for data storage. The system control unit 200
also writes to a dedicated memory, which in a preferred
embodiment is the RAM 204, the selected channel, the time 112
of selection, the volume level 118, the program ID 116, and
the program title 117. Upon receiving the program selection
data 122, the new selected channel is directed to the channel 
processor 210 and the system control unit 200 writes to the 
dedicated memory the channel selection end time and the program 
title 117 at the time 112 of channel change. The system control 
unit 200 keeps track of the number of channel changes occurring 
during the viewing time via the channel change record 119. This 
data forms part of the subscriber selection data 110. 

The volume control signal 124 is sent to the audio 
processor 240. In a preferred embodiment, the volume level 118 
selected by the user 120 corresponds to the listening volume. 
In an alternate embodiment, the volume level 118 selected by
the user 120 represents a volume level to another piece of
equipment such as an audio system (home theatre system) or to
the television itself. In such a case, the volume can be
measured directly by a microphone or other audio sensing device
which can monitor the volume at which the selected source
material is being listened to.



A program change occurring while watching a selected 
channel is also logged by the system control unit 200. 
Monitoring the content of the program at the time of the 
program change can be done by reading the content of the EDS. 
The EDS contains information such as program title, which is
transmitted via the VBI. A change in the program title field is
detected by the monitoring system and logged as an event. In an
alternate embodiment, an EPG is present and program information
can be extracted from the EPG. In a preferred embodiment, the
programming data received from the EDS or EPG permits
distinguishing between entertainment programming and
advertisements.
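
The detection of a program change from the program title field can be
sketched as a comparison of successive title samples. The function name
detect_program_changes and the sample data are hypothetical, and the
sketch assumes that the title field has already been decoded into
(time, title) pairs.

```python
def detect_program_changes(title_samples):
    """Return (time, old_title, new_title) for each change in the title field.

    title_samples is an iterable of (time, title) pairs in time order.
    """
    events = []
    previous = None
    for time, title in title_samples:
        if previous is not None and title != previous:
            events.append((time, previous, title))
        previous = title
    return events


samples = [(0, "News"), (1, "News"), (2, "Weather"), (3, "Weather"), (4, "News")]
changes = detect_program_changes(samples)
print(changes)
```

Each returned tuple corresponds to one logged event; the first sample
never produces an event because there is no earlier title to compare with.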

FIG. 3 shows the block diagram of the channel processor
210. In a preferred embodiment, the input connector 220
connects to a tuner 300 which tunes to the selected channel. A
local oscillator can be used to heterodyne the signal to the
intermediate frequency (IF) signal. A demodulator 302
demodulates the received signal and the output is fed to a
forward error correction (FEC) decoder 304. The data stream
received from the FEC decoder 304 is, in a preferred
embodiment, in an MPEG format. In a preferred embodiment, a
system demultiplexer 306 separates out video and audio
information for subsequent decompression and processing, as
well as ancillary data which can contain program related
information.

The data stream presented to the system demultiplexer 306
consists of packets of data including video, audio and
ancillary data. The system demultiplexer 306 identifies each
packet from the stream ID and directs the stream to the
corresponding processor. The video data is directed to the
video processor module 230 and the audio data is directed to
the audio processor 240. The ancillary data can contain closed
captioning text, emergency messages, program guide, or other
useful information.

Closed captioning text is considered to be ancillary data
and is thus contained in the video stream. The system
demultiplexer 306 accesses the user data field of the video
stream to extract the closed captioning text. The program
guide, if present, is carried on a data stream identified by a
specific transport program identifier.
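
The routing performed by the system demultiplexer 306 can be pictured
with the following simplified sketch. It is not an MPEG parser: packets
are modeled as (stream_id, payload) pairs, and the stream identifiers and
destination names in ROUTES are hypothetical labels chosen for
illustration.

```python
from collections import defaultdict

# Hypothetical mapping from stream ID to the module that handles it.
ROUTES = {
    "video": "video_processor_230",
    "audio": "audio_processor_240",
    "ancillary": "data_store",
}


def demultiplex(packets):
    """Group packet payloads by destination module, based on the stream ID."""
    queues = defaultdict(list)
    for stream_id, payload in packets:
        destination = ROUTES.get(stream_id)
        if destination is None:
            continue  # unknown stream IDs are dropped in this sketch
        queues[destination].append(payload)
    return dict(queues)


routed = demultiplex([
    ("video", b"v1"),
    ("audio", b"a1"),
    ("ancillary", b"program guide"),
    ("video", b"v2"),
    ("unknown", b"x"),
])
print(routed)
```

Payload order within each destination is preserved, which matters for
video and audio decoding downstream.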

In an alternate embodiment, analog video can be used. For
analog programming, ancillary data such as closed captioning
text or EDS data are carried in a vertical blanking interval.

FIG. 4 shows the block diagram of a computer system for a
realization of the subscriber monitoring system based on the
reception of multimedia signals from a bi-directional network.
A system bus 422 transports data amongst the CPU 203, the RAM
204, Read Only Memory - Basic Input Output System (ROM-BIOS)
406 and other components. The CPU 203 accesses a hard drive 400
through a disk controller 402. The standard input/output
devices are connected to the system bus 422 through the I/O
controller 201. A keyboard is attached to the I/O controller
201 through a keyboard port 416 and the monitor is connected
through a monitor port 418. The serial port device uses a
serial port 420 to communicate with the I/O controller 201.
Industry Standard Architecture (ISA) expansion slots 408 and
Peripheral Component Interconnect (PCI) expansion slots 410
allow additional cards to be placed into the computer. In a
preferred embodiment, a network card is available to interface
a local area, wide area, or other network.

FIG. 5 illustrates a channel sequence and volume over a
twenty-four (24) hour period. The Y-axis represents the status
of the receiver in terms of on/off status and volume level. The
X-axis represents the time of day. The channels viewed are
represented by the windows 501-506, with a first channel 502
being watched followed by the viewing of a second channel 504,
and a third channel 506 in the morning. In the evening a
fourth channel 501 is watched, a fifth channel 503, and a sixth
channel 505. A channel change is illustrated by a momentary
transition to the "off" status and a volume change is
represented by a change of level on the Y-axis.

A detailed record of the subscriber selection data 110 is
illustrated in FIG. 6 in a table format. A time column 602
contains the starting time of every event occurring during the
viewing time. A Channel ID column 604 lists the channels
viewed or visited during that period. A program title column
603 contains the titles of all programs viewed. A volume column
601 contains the volume level 118 at the time 112 of viewing a
selected channel.

A representative statistical record corresponding to the
household viewing habits 195 is illustrated in FIG. 7. In a
preferred embodiment, a time of day column 700 is organized in
periods of time including morning, mid-day, afternoon, night,
and late night. In an alternate embodiment, smaller time
periods are used. A minutes watched column 702 lists, for each
period of time, the time in minutes in which the SCS 100
recorded delivery of programming. The number of channel changes
during that period and the average volume are also included in
that table in a channel changes column 704 and an average
volume column 706 respectively. The last row of the statistical
record contains the totals for the items listed in the minutes
watched column 702, the channel changes column 704 and the
average volume column 706.
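
By way of illustration only, the statistical record of FIG. 7 could be
derived from the event records of FIG. 6 roughly as sketched below. The
hour boundaries of each period, the ViewingEvent fields, the rule that an
event is credited to the period in which it starts, and the use of a
time-weighted mean for the average volume are all assumptions made for
this sketch and are not specified in the description.

```python
from dataclasses import dataclass

# Hypothetical period boundaries in hours (start inclusive, end exclusive).
# The description names the periods but does not define their hours.
PERIODS = {
    "morning": (6, 11),
    "mid-day": (11, 14),
    "afternoon": (14, 18),
    "night": (18, 23),
    "late night": (23, 30),  # 23:00 to 06:00, counted past midnight
}


@dataclass
class ViewingEvent:
    start_minute: int  # minutes since midnight (time column)
    duration: int      # minutes of programming delivered
    channel_id: int    # Channel ID column
    volume: int        # volume column


def period_of(minute):
    """Return the name of the period containing a minute of the day."""
    if not 0 <= minute < 24 * 60:
        raise ValueError("minute must lie within one day")
    hour = minute // 60
    if hour < 6:
        hour += 24  # early-morning hours belong to "late night"
    for name, (low, high) in PERIODS.items():
        if low <= hour < high:
            return name


def viewing_habits(events):
    """Build {period: (minutes watched, channel changes, average volume)}."""
    rows = {name: [0, 0, 0] for name in PERIODS}  # minutes, changes, volume*minutes
    previous_channel = None
    for event in sorted(events, key=lambda e: e.start_minute):
        row = rows[period_of(event.start_minute)]
        row[0] += event.duration
        row[2] += event.volume * event.duration
        # A change is counted whenever the channel differs from the last one.
        if previous_channel is not None and event.channel_id != previous_channel:
            row[1] += 1
        previous_channel = event.channel_id

    def summarize(minutes, changes, volume_minutes):
        average = volume_minutes / minutes if minutes else 0.0
        return (minutes, changes, average)

    record = {name: summarize(*row) for name, row in rows.items()}
    totals = [sum(row[i] for row in rows.values()) for i in range(3)]
    record["total"] = summarize(*totals)  # last row of the record
    return record
```

A household whose record shows zero minutes watched in the mid-day and
afternoon rows would, for example, be a candidate for the "two working
adults" determination discussed later in connection with FIG. 14.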

FIG. 8A illustrates an entity-relationship diagram for the
generation of the program characteristics vector 150. The
context vector generation and retrieval technique described in
US Patent 5,619,709, which is incorporated herein by reference,
can be applied for the generation of the program
characteristics vectors 150. Other techniques are well known by
those skilled in the art.

Referring to FIG. 8A, the source material 130 or the EPG
140 are passed through a program characterization process 800
to generate the program characteristics vectors 150. The
program characterization process 800 is described in accordance
with FIG. 8B. Program content descriptors including a first
program content descriptor 802, a second program content
descriptor 804 and an nth program content descriptor 806, each
classified in terms of the category 144, the sub-category 146,
and other divisions as identified in the industry accepted
program classification system, are presented to a context
vector generator 820. As an example, the program content
descriptor can be text representative of the expected content
of material found in the particular program category 144. In
this example, the program content descriptors 802, 804 and 806
would contain text representative of what would be found in
programs in the news, fiction, and advertising categories
respectively. The context vector generator 820 generates
context vectors for that set of sample texts resulting in a
first summary context vector 808, a second summary context
vector 810, and an nth summary context vector 812. In the
example given, the summary context vectors 808, 810, and 812
correspond to the categories of news, fiction and advertising
respectively. The summary vectors are stored in a local data
storage system.

Referring to FIG. 8B, a sample of the source related text
136 which is associated with the new program to be classified
is passed to the context vector generator 820 which generates a
program context vector 840 for that program. The source related
text 136 can be either the source material 130, the EPG 140, or
other text associated with the source material. A comparison is
made between the actual program context vectors and the stored
program content context vectors by computing, in a dot product
computation process 830, the dot product of the first summary
context vector 808 with the program context vector 840 to
produce a first dot product 814. Similar operations are
performed to produce second dot product 816 and nth dot product
818.

The values contained in the dot products 814, 816 and 818,
while not probabilistic in nature, can be expressed in
probabilistic terms using a simple transformation in which the
result represents a confidence level of assigning the
corresponding content to that program. The transformed values
add up to one. The dot products can be used to classify a
program, or form a weighted sum of classifications which
results in the program characteristics vectors 150. In the
example given, if the source related text 136 was from an
advertisement, the nth dot product 818 would have a high value,
indicating that the advertising category was the most
appropriate category, and assigning a high probability value to
that category. If the dot products corresponding to the other
categories were significantly higher than zero, those
categories would be assigned a value, with the result being the
program characteristics vectors 150 as shown in FIG. 9D.
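
The comparison and transformation described above might be sketched as
follows. The description does not define the "simple transformation";
the sketch assumes one possible choice, namely discarding negative dot
products and dividing each remaining value by their sum so that the
results add up to one. The function names and the three-element vectors
are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def classify(program_context_vector, summary_context_vectors):
    """Return {category: confidence} for one program.

    summary_context_vectors maps a category name to its stored summary
    context vector. The normalization used here is an assumption.
    """
    scores = {
        category: max(dot(summary, program_context_vector), 0.0)
        for category, summary in summary_context_vectors.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        # No category matched at all: fall back to a uniform distribution.
        return {category: 1.0 / len(scores) for category in scores}
    return {category: score / total for category, score in scores.items()}


# Toy example: text taken from an advertisement.
summaries = {
    "news": [1.0, 0.0, 0.0],
    "fiction": [0.0, 1.0, 0.0],
    "advertising": [0.0, 0.0, 1.0],
}
confidences = classify([0.1, 0.1, 0.8], summaries)
best_category = max(confidences, key=confidences.get)
```

In this toy case the advertising category receives the largest
confidence, while the news and fiction categories keep small non-zero
values, which corresponds to the statistical vectors of FIG. 9D.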

For the sub-categories, probabilities obtained from the
content pertaining to the same sub-category 146 are summed to
form the probability for the new program being in that
sub-category 146. At the sub-category level, the same method is
applied to compute the probability of a program being from the
given category 144. The three levels of the program
classification system (the category 144, the sub-category 146
and the content) are used by the program characterization
process 800 to form the program characteristics vectors 150
which are depicted in FIGS. 9D-9F.
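
The summation from the content level up to the sub-category and category
levels can be illustrated with a short sketch. The content,
sub-category and category labels below are hypothetical examples and are
not taken from any particular program classification system.

```python
def roll_up(child_probabilities, parent_of):
    """Sum the probabilities of all children that share the same parent."""
    totals = {}
    for child, probability in child_probabilities.items():
        parent = parent_of[child]
        totals[parent] = totals.get(parent, 0.0) + probability
    return totals


# Hypothetical content-level probabilities for one program.
content_probabilities = {
    "election coverage": 0.5,
    "weather report": 0.25,
    "car commercial": 0.25,
}
# Hypothetical hierarchy: content -> sub-category -> category.
sub_category_of = {
    "election coverage": "politics",
    "weather report": "weather",
    "car commercial": "automotive",
}
category_of = {
    "politics": "news",
    "weather": "news",
    "automotive": "advertising",
}

sub_category_probabilities = roll_up(content_probabilities, sub_category_of)
category_probabilities = roll_up(sub_category_probabilities, category_of)
```

Because each level is obtained by summing the level below it, the
probabilities at every level add up to the same total.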

The program characteristics vectors 150 in general are
represented in FIGS. 9A through 9F. FIGS. 9A, 9B and 9C are an
example of deterministic program vectors. This set of vectors
is generated when the program characteristics are well defined,
as can occur when the source related text 136 or the EPG 140
contains specific fields identifying the category 144 and the
sub-category 146. A program rating can also be provided by the
EPG 140.

In the case that these characteristics are not specified,
a statistical set of vectors is generated from the process
described in accordance with FIGS. 8A and 8B. FIG. 9D shows the
probability that a program being watched is from the given
category 144. The categories are listed on the X-axis. The
sub-category 146 is also expressed in terms of probability. This
is shown in FIG. 9E. The content component of this set of
vectors is a third possible level of the program classification,
and is illustrated in FIG. 9F.

FIG. 10A illustrates sets of logical heuristic rules
which form part of the heuristic rules 160. In a preferred
embodiment, logical heuristic rules are obtained from
sociological or psychological studies. Two types of rules are
illustrated in FIG. 10A. The first type links an individual's
viewing characteristics to demographic characteristics such as
gender, age, and income level. A channel changing rate rule
1030 attempts to determine gender based on channel change rate.
An income related channel change rate rule 1010 attempts to
link channel change rates to income brackets. A second type of
rule links particular programs to particular audiences, as
illustrated by a gender determining rule 1050 which links the
program category 144/sub-category 146 with a gender. The
results of the application of the logical heuristic rules
illustrated in FIG. 10A are probabilistic determinations of
factors including gender, age, and income level. Although a
specific set of logical heuristic rules has been used as an
example, a wide number of types of logical heuristic rules can
be used to realize the present invention. In addition, these
rules can be changed based on learning within the system or
based on external studies which provide more accurate rules.

FIG. 10B illustrates a set of the heuristic rules 160
expressed in terms of conditional probabilities. In the
example shown in FIG. 10B, the category 144 has associated with
it conditional probabilities for demographic factors such as
age, income, family size and gender composition. The category
144 has associated with it conditional probabilities that
represent the probability that the viewing group is within a
certain age group dependent on the probability that they are
viewing a program in that category 144.

FIG. 11 illustrates an entity-relationship diagram for the
generation of the program demographic vectors 170. In a
preferred embodiment, the heuristic rules 160 are applied along
with the program characteristic vectors 150 in a program target
analysis process 1100 to form the program demographic vectors
170. The program characteristic vectors 150 indicate a
particular aspect of a program, such as its violence level. The
heuristic rules 160 indicate that a particular demographic
group has a preference for that program. As an example, it may
be the case that young males have a higher preference for
violent programs than other sectors of the population. Thus, a
program which has the program characteristic vectors 150
indicating a high probability of having violent content, when
combined with the heuristic rules 160 indicating that "young
males like violent programs," will result, through the program
target analysis process 1100, in the program demographic
vectors 170 which indicate that there is a high probability
that the program is being watched by a young male.

The program target analysis process 1100 can be realized
using software programmed in a variety of languages which
processes mathematically the heuristic rules 160 to derive the
program demographic vectors 170. The table representation of
the heuristic rules 160 illustrated in FIG. 10B expresses the
probability that the individual or household is from a specific
demographic group based on a program with a particular category
144. This can be expressed, using probability terms, as follows:
"the probability that the individuals are in a given
demographic group conditional to the program being in a given
category". Referring to FIG. 9D, the probability that the group
has certain demographic characteristics based on the program
being in a specific category is illustrated.

The probability that a program is destined for a specific
demographic group can be determined by applying Bayes' rule.
This probability is the sum of the conditional probabilities
that the demographic group likes the program, conditional on
the category 144, weighted by the probability that the program
is from that category 144. In a preferred embodiment, the
program target analysis can calculate the program demographic
vectors by application of logical heuristic rules, as
illustrated in FIG. 10A, and by application of heuristic rules
expressed as conditional probabilities as shown in FIG. 10B.
Logical heuristic rules can be applied using logical
programming and fuzzy logic using techniques well understood by
those skilled in the art, and are discussed in the text by
S. V. Kartalopoulos entitled "Understanding Neural Networks and
Fuzzy Logic" which is incorporated herein by reference.

Conditional probabilities can be applied by simple
mathematical operations multiplying program context vectors by
matrices of conditional probabilities. By performing this
process over all the demographic groups, the program target
analysis process 1100 can measure how likely a program is to be
of interest to each demographic group. Those probability
values form the program demographic vector 170 represented in
FIG. 12.

As an example, the heuristic rules expressed as
conditional probabilities shown in FIG. 10B are used as part of
a matrix multiplication in which the program characteristics
vector 150 of dimension N, such as those shown in FIGS. 9A-9F,
is multiplied by an N x M matrix of heuristic rules expressed
as conditional probabilities, such as that shown in FIG. 10B.
The resulting vector of dimension M is a weighted average of
the conditional probabilities for each category and represents
the household demographic characteristics 190. Similar
processing can be performed at the sub-category and content
levels.
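
A minimal sketch of this matrix multiplication is given below, assuming
N = 3 program categories and M = 3 demographic groups. The numeric
values and the group labels are invented for illustration and do not
come from FIG. 9D or FIG. 10B.

```python
def apply_conditional_rules(category_probabilities, conditional_matrix):
    """Multiply an N-vector by an N x M matrix of conditional probabilities.

    category_probabilities[i] is P(category i) for the program.
    conditional_matrix[i][j] is P(demographic group j | category i).
    The result[j] is the sum over i of P(group j | category i) * P(category i).
    """
    n = len(category_probabilities)
    if len(conditional_matrix) != n:
        raise ValueError("matrix must have one row per category")
    m = len(conditional_matrix[0])
    if any(len(row) != m for row in conditional_matrix):
        raise ValueError("all matrix rows must have the same length")
    return [
        sum(category_probabilities[i] * conditional_matrix[i][j] for i in range(n))
        for j in range(m)
    ]


# Hypothetical values: categories are news, fiction, advertising;
# demographic groups are "under 30", "30 to 50", "over 50".
category_probabilities = [0.5, 0.25, 0.25]
conditional_matrix = [
    [0.2, 0.4, 0.4],  # P(group | news)
    [0.6, 0.2, 0.2],  # P(group | fiction)
    [0.4, 0.4, 0.2],  # P(group | advertising)
]
demographic_vector = apply_conditional_rules(category_probabilities, conditional_matrix)
```

When the input vector sums to one and every row of the matrix sums to
one, the resulting M-dimensional vector also sums to one, so it can be
read directly as a set of probabilities over the demographic groups.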



FIG. 12 illustrates an example of the program demographic
vector 170, and shows the extent to which a particular program
is destined to a particular audience. This is measured in terms
of probability as depicted in FIG. 12. The Y-axis is the
probability of appealing to the demographic group identified on
the X-axis.

FIG. 13 illustrates an entity-relationship diagram for the
generation of household session demographic data 1310 and
household session interest profile 1320. In a preferred
embodiment, the subscriber selection data 110 is used along
with the program characteristics vectors 150 in a session
characterization process 1300 to generate the household session
interest profile 1320. The subscriber selection data 110
indicates what the subscriber is watching, for how long and at
what volume they are watching the program.

In a preferred embodiment, the session characterization
process 1300 forms a weighted average of the program
characteristics vectors 150 in which the time duration the
program is watched is normalized to the session time (typically
defined as the time from which the unit was turned on to the
present). The program characteristics vectors 150 are
multiplied by the normalized time duration (which is less than
one unless only one program has been viewed) and summed with
the previous value. Time duration data, along with other
subscriber viewing information, is available from the
subscriber selection data 110. The resulting weighted average
of program characteristics vectors forms the household session
interest profile 1320, with each program contributing to the
household session interest profile 1320 according to how long
it was watched. The household session interest profile 1320 is
normalized to produce probabilistic values of the household
programming interests during that session.
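
The time-weighted averaging performed by the session characterization
process could be sketched as follows. For simplicity the sketch assumes
that the session time equals the sum of the individual viewing
durations; the function name and the two-element vectors are
hypothetical.

```python
def session_interest_profile(viewings):
    """Form a time-weighted average of program characteristics vectors.

    viewings is a list of (minutes_watched, characteristics_vector) pairs,
    one pair per program watched during the session.
    """
    session_time = sum(minutes for minutes, _ in viewings)
    if session_time <= 0:
        raise ValueError("session contains no viewing time")
    size = len(viewings[0][1])
    profile = [0.0] * size
    for minutes, vector in viewings:
        weight = minutes / session_time  # normalized time duration
        for k in range(size):
            profile[k] += weight * vector[k]  # summed with the previous value
    total = sum(profile)
    # Final normalization so the profile reads as probabilistic values.
    return [value / total for value in profile] if total else profile


# 45 minutes of a program in the first category, 15 minutes in the second.
profile = session_interest_profile([(45, [1.0, 0.0]), (15, [0.0, 1.0])])
```

Each program thus contributes to the profile in proportion to how long
it was watched during the session.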

In an alternate embodiment, the heuristic rules 160 are
applied to both the subscriber selection data 110 and the
program characteristics vectors 150 to generate the household
session demographic data 1310 and the household session
interest profile 1320. In this embodiment, weighted averages
of the program characteristics vectors 150 are formed based on
the subscriber selection data 110, and the heuristic rules 160
are applied. In the case of logical heuristic rules as shown
in FIG. 10A, logical programming can be applied to make
determinations regarding the household session demographic data
1310 and the household session interest profile 1320. In the
case of heuristic rules in the form of conditional
probabilities such as those illustrated in FIG. 10B, a dot
product of the time averaged values of the program
characteristics vectors can be taken with the appropriate
matrix of heuristic rules to generate both the household
session demographic data 1310 and the household session
interest profile 1320.

Volume control measurements which form part of the
subscriber selection data 110 can also be applied in the
session characterization process 1300 to form a household
session interest profile 1320. This can be accomplished by
using normalized volume measurements in a weighted average
manner similar to how time duration is used. Thus, muting a
show results in a zero value for volume, and the program
characteristics vector 150 for this show will not be averaged
into the household session interest profile 1320.
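
A brief sketch of the volume-based weighting follows. It assumes that
each volume measurement is normalized by the sum of the volume
measurements for the session, which is only one of several possible
normalizations; the names and values are hypothetical.

```python
def volume_weighted_profile(viewings):
    """Average program characteristics vectors using normalized volume.

    viewings is a list of (volume_level, characteristics_vector) pairs.
    A muted program has volume 0 and therefore contributes nothing.
    """
    total_volume = sum(volume for volume, _ in viewings)
    size = len(viewings[0][1])
    profile = [0.0] * size
    if total_volume == 0:
        return profile  # every program was muted
    for volume, vector in viewings:
        weight = volume / total_volume  # normalized volume measurement
        for k in range(size):
            profile[k] += weight * vector[k]
    return profile


# The first program is muted, so only the last two shape the profile.
volume_profile = volume_weighted_profile(
    [(0, [1.0, 0.0]), (6, [0.0, 1.0]), (2, [1.0, 0.0])]
)
```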

FIG. 14 illustrates an entity-relationship diagram for the
generation of average household demographic characteristics and
session household demographic characteristics 190. A household
demographic characterization process 1400 generates the
household demographic characteristics 190 represented in table
format in FIG. 15. The household demographic characterization
process 1400 uses the household viewing habits 195 in
combination with the heuristic rules 160 to determine
demographic data. For example, a household with a number of
minutes watched of zero during the day may indicate a household
with two working adults. Both logical heuristic rules as well
as rules based on conditional probabilities can be applied to
the household viewing habits 195 to obtain the household
demographic characteristics 190.

The household viewing habits 195 are also used by the
system to detect out-of-habits events. For example, if a
household with a zero value for the minutes watched column 702
at late night presents a session value at that time via the
household session demographic data 1310, this session will be
characterized as an out-of-habits event and the system can
exclude such data from the average if it is highly probable
that the demographics for that session are greatly different
than the average demographics for the household. Nevertheless,
the results of the application of the household demographic
characterization process 1400 to the household session
demographic data 1310 can result in valuable session
demographic data, even if such data is not added to the average
demographic characterization of the household.
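
One possible way to detect an out-of-habits event and to keep it out of
the household average is sketched below. The description does not state
how "greatly different" is measured; the sketch assumes a largest
absolute difference compared against a hypothetical threshold of 0.5,
and a simple running mean for the average. All names are illustrative.

```python
def is_out_of_habits(habit_minutes, period, session_minutes):
    """True when a session occurs in a period the household never watches."""
    return session_minutes > 0 and habit_minutes.get(period, 0) == 0


def greatly_different(average, session, threshold=0.5):
    """Assumed test: largest per-element difference exceeds the threshold."""
    return max(abs(a - s) for a, s in zip(average, session)) > threshold


def update_average(average, session, count, exclude):
    """Fold a session vector into a running mean unless it is excluded.

    Returns the (possibly unchanged) average and the number of sessions
    that have contributed to it.
    """
    if exclude:
        return list(average), count
    updated = [(a * count + s) / (count + 1) for a, s in zip(average, session)]
    return updated, count + 1


habits = {"night": 120, "late night": 0}  # minutes watched per period
average = [0.5, 0.5]                      # e.g. P(adult), P(child)
session = [1.0, 0.0]                      # demographics of the new session
exclude = is_out_of_habits(habits, "late night", 40) and greatly_different(
    average, session, threshold=0.4
)
new_average, new_count = update_average(average, session, 3, exclude)
```

The session vector itself remains available as session demographic data
even when it is excluded from the average.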

FIG. 15 illustrates the average and session household
demographic characteristics. A household demographic parameters
column 1501 is followed by an average value column 1505, a
session value column 1503, and an update column 1507. The
average value column 1505 and the session value column 1503 are
derived from the household demographic characterization process
1400. The deterministic parameters such as address and
telephone numbers can be obtained from an outside source or can
be loaded into the system by the subscriber or a network
operator at the time of installation. Updating of deterministic
values is prevented by indicating that these values should not
be updated in the update column 1507.

FIG. 16 illustrates an entity-relationship diagram for the
generation of the household interest profile 180 in a household
interest profile generation process 1600. In a preferred
embodiment, the household interest profile generation process
comprises averaging the household session interest profile 1320
over multiple sessions and applying the household viewing
habits 195 in combination with the heuristic rules 160 to form
the household interest profile 180 which takes into account
both the viewing preferences of the household as well as
assumptions about households/subscribers with those viewing
habits and program preferences.

FIG. 17 illustrates the household interest profile 180
which is composed of a programming types row 1709, a product
types row 1707, a household interests column 1701, an
average value column 1703, and a session value column 1705.

The product types row 1707 gives an indication as to what 
type of advertisement the household would be interested in 
watching, thus indicating what types of products could 
potentially be advertised with a high probability of the 
advertisement being watched in its entirety. The programming 
types row 1709 suggests what kind of programming the household 
is likely to be interested in watching. The household interests 
column 1701 specifies the types of programming and products 
which are statistically characterized for that household. 

As an example of the industrial applicability of the
invention, a household will perform its normal viewing routine
without being requested to answer specific questions regarding
likes and dislikes. Children may watch television in the
morning in the household, and may change channels during
commercials, or not at all. The television may remain off
during the working day, while the children are at school and
day care, and be turned on again in the evening, at which time
the parents may "surf" channels, mute the television during
commercials, and ultimately watch one or two hours of broadcast
programming. The present invention provides the ability to
characterize the household, and may make the determination that
there are children and adults in the household, with program
and product interests indicated in the household interest
profile 180 corresponding to a family of that composition. A
household with two retired adults will have a completely
different characterization which will be indicated in the
household interest profile 180.

Although the present invention has been largely described
in the context of a single computing platform receiving
programming, the SCS 100 can be realized as part of a
client-server architecture, as illustrated in FIG. 18. Referring
to FIG. 18, residence 1800 contains a personal computer (PC) 1820
as well as the combination of a television 1810 and a set-top
1808, which can request and receive programming. The equipment
in residence 1800, or similar equipment in a small or large
business environment, forms the client side of the network as
defined herein. Programming is delivered over an access
network 1830, which may be a cable television network,
telephone type network, or other access network. Information
requests are made by the client side to a server 1840 which
forms the server side of the network. Server 1840 has content
locally which it provides to the subscriber, or requests
content on behalf of the subscriber from a third party content
provider 1860, as illustrated in FIG. 18. Requests made on
behalf of the client side by server 1840 are made across a wide
area network 1850 which can be the Internet or other public or
private network. Techniques for making requests on behalf of a
client are frequently referred to as proxy techniques and are
well known to those skilled in the art. The server side
receives the requested programming which is displayed on PC
1820 or television 1810 according to which device made the
request.

In a preferred embodiment the server 1840 maintains the
subscriber selection data 110 which it is able to compile based
on its operation as a proxy for the client side. Retrieval of
source related information, the program characterization
process 800, the program target analysis process 1100, the
session characterization process 1300, the household
demographic characterization process 1400, and the household
interest profile generation process 1600 can be performed by
server 1840.
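
The way a proxy can compile subscriber selection data as a side effect of
serving requests might be sketched as follows. The class name, method
names, identifiers and the callable used to reach the third party content
provider are all hypothetical; the sketch ignores networking and shows
only the bookkeeping.

```python
class ProxyServer:
    """Toy stand-in for a server acting as a proxy for the client side."""

    def __init__(self, local_content, fetch_from_provider):
        self.local_content = dict(local_content)        # content held locally
        self.fetch_from_provider = fetch_from_provider  # third party lookup
        self.selection_data = {}  # subscriber id -> list of (time, channel)

    def request(self, subscriber_id, channel_id, time):
        """Log the selection, then return local or third party content."""
        log = self.selection_data.setdefault(subscriber_id, [])
        log.append((time, channel_id))
        if channel_id in self.local_content:
            return self.local_content[channel_id]
        # Not held locally: request it on behalf of the subscriber.
        return self.fetch_from_provider(channel_id)


server = ProxyServer(
    {"channel-1": "local news feed"},
    lambda channel_id: "remote:" + channel_id,
)
first = server.request("residence-1800", "channel-1", 100)
second = server.request("residence-1800", "channel-7", 160)
```

Because every request passes through the proxy, the logged selections
can feed the characterization processes without any software on the
client side recording them.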

As an example of the industrial applicability of the
invention, a household will perform its normal viewing routine
without being requested to answer specific questions regarding
likes and dislikes. Children may watch television in the
morning in the household, and may change channels during
commercials, or not at all. The television may remain off
during the working day, while the children are at school and
day care, and be turned on again in the evening, at which time
the parents may "surf" channels, mute the television during
commercials, and ultimately watch one or two hours of broadcast
programming. The present invention provides the ability to
characterize the household, and may make the determination that
there are children and adults in the household, with program
and product interests indicated in the household interest
profile 180 corresponding to a family of that composition. A
household with two retired adults will have a completely
different characterization which will be indicated in household
interest profile 180, which is stored at server 1840 and can be
accessed by various groups wishing to provide programming and
advertisements to the members of residence 1800.

Although this invention has been illustrated by reference
to specific embodiments, it will be apparent to those skilled
in the art that various changes and modifications may be made
which clearly fall within the scope of the invention. The
invention is intended to be protected broadly within the spirit
and scope of the appended claims.